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Abstract 

Goal : This paper deals with the problems that some EEG signals have no good sparse representation and single 
channel processing is not computationally efficient in compressed sensing of multi-channel EEG signals. Methods: 

An optimization model with L0 norm and Schatten-0 norm is proposed to enforce cosparsity and low rank structures 
in the reconstructed multi-channel EEG signals. Both convex relaxation and global consensus optimization with 
alternating direction method of multipliers are used to compute the optimization model. Results'. The performance 
of multi-channel EEG signal reconstruction is improved in term of both accuracy and computational complexity. 
Conclusion: The proposed method is a better candidate than previous sparse signal recovery methods for compressed 
sensing of EEG signals. Significance: The proposed method enables successful compressed sensing of EEG signals 
even when the signals have no good sparse representation. Using compressed sensing would much reduce the power 
consumption of wireless EEG system. 


Index Terms 

alternating direction method of multipliers (ADMM), compressed sensing, cosparse signal recovery, low rank 
matrix recovery, multi-channel electroencephalogram (EEG). 


I. Introduction 

Wireless body sensor networks take spatially distributed sensors to acquire physiological signals, and 
transmit them over wireless links to a central unit for signal processing 0]. The electroencephalogram 
(EEG) signal is one of the most frequently used biomedical signals. It has important applications in 
medical healthcare, brain computer interfacing (BCI), and so on fl2l. Continuous EEG monitoring usually 
requires large amount of data to be sampled and transmitted, which leads to large size of batteries. The 
recording unit of the wireless portable EEG systems is powered with batteries, and the physical size of 
the batteries sets the overall device size and operational lifetime. A physically too large device would not 
be portable; and excessive battery power consumption would make the long time wireless recording very 

hard mmm. 

Compressed sensing (CS) was proposed to deal with this challenge. Rather than first sample the analog 
signal at Nyquist rate and discard most in the compression, it directly acquires the digital compressed 
measurements at a lower sampling rate, and recovers the digital signals by nonlinear algorithms from the 
compressed measurements (5). CS relies on the assumption that the signal vector x is compressed by a 
random matrix O e M M/N (measurement or sampling matrix) in discrete form as |[6l IfTTl: 

y = <£x, (l) 
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where y is the random sub-Nyquist compressed measurement. Here M <sz N, which means that it is 
sampled at a greatly reduced rate. If x is sparse, its recovery only requires the compressed signal y 
and the sampling matrix <J>. If it is not sparse, the signal x should be represented (transformed) using a 
representation matrix (dictionary) f e R NxP with N < P and a sparse vector 9 e R Px 1 with most of its 
entries zero or almost zero as: 

x = me. (2) 

With the compressed measurement y, sampling matrix <t> and dictionary T, we can recover x by ([2]) after 
computing 9 by: 

minimize ||0|| n 

6 , (3) 

subject to y = OW 

where ||0|| o is the pseudo-Co norm which counts the number of nonzero entries, i.e. ||0|| o = #{9 n ^0, n = 
1,2, • • • ,N). The signal x is called ^-sparse when the number of nonzero entries is K. Most of the current 
methods for biomedical signal recovery from compressed samples are based on the solution of the Co 
programming problem ©, such as, basis pursuit (BP), orthogonal matching pursuit (OMP), iterative hard 
thresholding (IHT), etc [01 [0 @. Besides, © found that some EEG signals are not sparse in any sparse 
transformed domains, and proposed to exploit block-sparsity by block sparse Bayesian learning (BSBL) 
to recover EEG signals Il5l . 

Contrary to the traditional sparse or block-sparse signal model, the cosparse signal model uses an 
analysis operator multiplying the measurement to produce a sparse vector IfTOl : 

P = fix, (4) 

where Q e R QxN is the cosparse representation matrix (analysis dictionary) with N < Q, and p e R exl 
is the cosparse vector if most of its entries are nearly zero. Several sufficient conditions theoretically 
guarantee the successful recovery of the cosparse signal from the compressed measurement, such as the 
restricted isometry property adapted to the dictionary (D-RIP), restricted orthogonal projection property 
(ROPP), etc IfTOl IfTTl [fl2l . When N = P, an equivalent cosparse signal model to the sparse signal model 
can be found by letting n=¥ _1 ; but there is no such an equivalent when N < P. The traditional sparse 
synthesis model puts an emphasis on the non-zeros of the sparse vector 6, but the cosparse analysis model 
draws its strength from the zeros of the analysis vector p. 

The cosparse signal recovery has some unique advantages in CS based EEG systems. First, the sparse 
signal recovery ([3]) gets the best estimate of the sparse vector 9\ but the cosparse signal recovery © gets 
the EEG signal’s best estimate directly. Second, theoretically the sparse signal recovery © requires the 
columns of the representation matrix to be incoherent, but the cosparse way © allows the coherence 
of the cosparse representation matrix B, which can result in super resolution of the EEG signal estimate 
IfTTl . Third, the EEG signal can hardly be sparsely represented ©. However, data analysis shows that the 
EEG signals are approximately piecewise linear IfTTl . as shown in Fig. |TJ which implies the signal fits the 
cosparse signal model © well with the 2nd order difference matrix as the cosparse analysis dictionary. 
Therefore, the cosparse signal recovery should be more appropriate for CS of EEG signals. 

Since nearly all types of EEG systems have multiple channels, it can be taken for granted that it is 
better to jointly process the multi-channel EEG signals. iTffl proposed to jointly process multi-channel 
EEG signals by allowing slightly different phases of the dictionaries in different channels. Another classical 
way assumes that multiple channels share a similar support of sparse vector. This generalizes the single 
measurement vector (SMV) problem straightforwardly to a multiple measurement vector (MMV) problem 
lfl5l lfT6l . IfTTl proposed to incorporate preprocessing and entropy coding in the sampling to reduce the 
redundance in correlated multi-channel signals, but the added preprocessing and encoder would increase 
the power consumption in EEG sampling fU; and the procedure can hardly be realized for analog signals, 
which implies the analog EEG signals should be sampled at Nyquist sampling rate in the beginning. To 
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compress the multi-channel EEG signals from the complete digital measurement, lUffil used a wavelet- 
based volumetric coding method, while lfT9l exploited the low rank structure in matrix/tensor form and 
achieved better performance. 

Since most of the multi-channel EEG signals are more or less correlated with each other, the low rank 
structure based compression method motivates the use of low rank data structure in CS of multi-channel 
EEG signals too. The multi-channel EEG signals are put columnwise into a matrix. Our EEG data analysis 
finds that the newly formed EEG data matrix has only a few nonzero singular values. 

In this paper, the 2nd-order difference matrix is chosen to be the cosparse analysis dictionary, which 
tries to enforce the approximate piecewise linear structure. Exploiting additionally the low rank structure, 
we can further enhance the signal recovery performance by exploiting the cosparsity of single channel 
EEG signals and the low rank property of multi-channel EEG signals simultaneously in the framework 
of multi-structure CS. The f 0 norm and Schatten-0 norm based optimization model is used to encourage 
cosparsity and low rank structure in the reconstructed signals. Two methods are proposed to solve the 
multi-criterion optimization problem. One relaxes it to a convex optimization; and the other one transforms 
it into a global consensus optimization problem. The alternating direction method of multipliers (ADMM) 
is used to solve it efficiently. The convergence and computational complexity are briefly analyzed. In 
numerical experiments, a group of real-life EEG data is used to test the algorithms’ performance of both 
single-channel and multi-channel EEG signal recovery methods. Numerical results show that the cosparse 
signal recovery method and simultaneous cosparsity and low-rank (SCLR) optimization achieve the best 
performance in term of mean squared error (MSE) and mean cross-correlation (MCC) in single channel 
and multi-channel EEG signal recovery respectively. 

The rest of the paper is organized as follows. Section |TT] presents an optimization model to exploit 
both cosparsity and low rank data structures to recover the EEG signals. In Section |TTll two methods 
are given to solve the optimization problem, i.e. convex relaxation and alternating direction method of 
multipliers (ADMM). In Section QA] numerical experiments are used to demonstrate the proposed methods’ 
performance improvement. Section [V] draws the conclusion. 


II. Simultaneous Cosparsity and Low Rank Optimization Model 
The optimization model for cosparse signal recovery can be formulated as IflOl : 

minimize ||fix|| 0 

x • (5) 

subject to y = <J>x 

Here we call ([5]) the analysis LO optimization. When the EEG system records R channels simultaneously, 
the extension of analysis LO optimization to multi-channel data is: 


minimize ||vec (QX)|| 0 
subject to Y = ®X 


( 6 ) 


where X e R NxR , and vec(X) puts all the columns of X into one column vector sequentially. A series of 
solvers are summarized in IflOl . 

Reconstructing the EEG matrix from the compressed measurements by exploiting the low rank structure 
can be formulated as: 

minimize | |X|| Schatten _ 0 

subject to Y=OX 


where ||X|| Schatten _ 0 is the Schatten-0 norm which counts the number of the nonzero singular values of X 
Itm A variety of methods to solve it can be found in If22l . 

Motivated by the fact that many EEG signals have both cosparsity and low rank structure, we propose 
to simultaneously exploit these two data structures in multi-channel EEG signal reconstruction from the 
compressed measurement. Both £ () norm and Schatten-0 norm based constraints are used in the optimization 
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model. Combining with the linear data fitting constraint, we can formulate the simultaneous cosparsity 
and low rank (SCLR) optimization model as follows: 


minimize || vec (ftX)|| 0 + l|X|| Schatten _ 0 
subject to Y=tDX 


(8) 


III. SOLUTIONS 

A. Convex relaxation 

To solve the SCLR optimization ([8]), one classical way relaxes the nonconvex £ {) norm and Schatten-0 
norm into convex t\ norm and Schatten-1 norm respectively, where the fj norm sums all the absolute 
values of the entries, i.e. Hxllj = 2«=i \x n \- The Schatten-1 norm is called nuclear norm too, and sums all 
the singular values of the data matrix, i.e. ||X|| Schatten _ 1 = ||X||* = cr„. The newly formed convex 

simultaneous cosparsity and low rank (CSCLR) optimization model can be formulated as: 

minimize ||vec (flX)||, + ||X||* 

x . (9) 

subject to Y=OX 

Similarly to the reformulation from minimizc x 11x11, to minimizc xe>0 l 7 e, subject to - e < x < e due 
to the definition of the l\ norm, we can re-formulate the ('\ norm minimization into its equivalent linear 
programming in ([9]) [|23l By introduction of new nonnegative variables e and/, © can be expressed as: 

minimize l 7 e + f 

X,e>0, />0 

subject to Y=<I>X qq\ 

IIX||. < / ’ 

-e < vec (QX) < e 


where 1 e R e7?xl is a column vector with all the entries being 1. 

The nuclear norm constraint can be replaced by its linear matrix inequality (LMI) equivalent; and the 
approximation constraints can also be expressed via LMIs using Schur complements Il24l . The obtained 
optimization model is: 

minimize l 7 e + 2f 

X,e>0./>0 


subject to Y=<DX 
-e -< vec (SIX) < e 


>0 


A X 
X r B 

Tr(A) +Tr (B) < / 


(ID 


where A = A 7 and B = B 7 are new variables. (fTTl) is a semi-definite programming (SDP) which can be 
solved by interior-point method 11231 |f2D . The software CVX can compute the solution in this way [f25fl . 


B. ADMM 

Besides the classical SDR another method, called ADMM, can be used to solve the SCLR optimization 
[f26l . With individual constraints on the same variables in each constraint, © can be rewritten into a 
global consensus optimization with local variables X ( , i = 1,2 and a common global variable X as: 

minimize ||vec(fiX 1 )|| 1 + ||X 2 ||* 

x i,x 2 , x . (12) 

subject to X=X,; X=X 2 ; Y = <DX 

Here the new constraints are that all the local variables should be equal. It is equivalent to: 

minimize ||vec(QX 1 )|| 1 + ||X 2 ||* 

x h x 2 ,x _ _ _ _ ( 13 ) 

subject to Y=<DX i; Y=<DX 2 
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where 



(14) 


® = 


I 


(15) 


The corresponding augmented Lagrangian of (fl3l) is: 

MXt,X 2 ; Z ls Z 2 ) = ||vec(nX 1 )|| 1 + ||X 2 |L 


+vec(Z 1 ) r vec (y - + vec(Z 2 ) r vec (Y - OX 2 ) , 


(16) 


+ fll Y -°x 1 || F + f ||y-ox 2 


x; +1 := argmin(||vec(£lX 1 )|| 1 +^||Y-d)X 1 +®U f ! 




where p > 0, Zj and Z 2 are dual variables. The resulting ADMM algorithm in the scaled dual form is 
the following 

(17) 

(18) 
(19) 


x 2 


P Ik", 


X 2 := arg min ||X 2 ||, + £ Y - OX 2 + 


2|| F 


X f+1 = ±(X‘ +1 +X‘ +1 ), 


u* +1 = u' + (x; +1 - x') 
u 2 +I = in + (x; +1 - X*) ’ 


( 20 ) 


where Ui = 1/pZj and U 2 = l/pZ 2 are scaled dual variables. In the proposed ADMM algorithm for SCLR 
optimization, two steps separately optimize over variables generally, i.e. updating the prime variables Xi 
and X 2 , updating the scaled dual variables Ui and U 2 . In this iterative algorithm, the variables are updated 
in an alternating fashion. 

For both (fT71) and (fl8T) . there are many computationally efficient algorithms IflOl lf22l . For example, anal¬ 
ysis LI optimization, greedy analysis pursuit (GAP) can be used to solve ( [T7T) : to solve (fl8T) . SDP method 
or singular value thresholding (SVT) can be used. The solutions of (fl9l) and (l20l) are straightforwardly 
easy. The ADMM for SCLR optimization is summarized in Algorithm 1. 


Algorithm 1: ADMM for the SCLR optimization 


x, p; 


• Set t := 0. a small scalar 77 > 0, U°, U ( ), X°, T, 

repeat 

• step 1: update of the analysis LI optimization: X ( +1 := argminl||vec(£2X 

x, V 

•step 2: update of the low rank optimization: X ' 2 41 := arg min I 

x 2 v 

•step 3: update of the global variable: X ,+1 = ((X '’ 1 + X 2 +1 ); 

U ' +1 = U', +(x ' +1 -X') 

•step 4: update ol the dual variables: . } , , < ; 

U 2 = U 2 + ( X 2 - X l) 

• step 5: set t := t + 1; 

until < n or t- T 

unul ||X'+i|| F ||X'|| f s 77 or t - i n 

• Algorithm ends and return X 


tillt +f IlY-OX, +OU 1 
arg min (||X 2 |U + § ||Y - OX 2 + 


max? 
t +1 


illl); 


A lot of convergence results exist for ADMM in the literature ll26ll . Generally, the convergence to 
optimum can be guaranteed when the epigraph of y,: 
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epig,- = {(X, e) | gi (X) < s, i= 1,2.} (21) 

is a closed nonempty convex set, where gi(X) = ||vec (i2X)|| l5 g 2 (X) = ||X|| t , and the unaugmented 
Lagrangian 

L p=0 (X 1 ,X 2 ;Z 1 ,Z 2 ) = || V ec(nX 1 )|| 1 + ||X 2 |L 

+vec(Z 1 ) T vec (Y - OXj) + vec(Z 2 ) r vec (y - OX 2 ) (22) 

has a saddle point. The proof can be found in [|27ll . 

The ADMM decomposes the optimization model with multiple constraints into several ones with fewer 
constraints. There could be some fast algorithms for these new optimization models. Besides, it allows 
multiple steps in one iteration to be processed in parallel. With a multi-core processor, the computational 
time can be decreased. Previous experience shows that a few iterations will often produce acceptable 
results of practical use. 


IV. Numerical Experiments 

To demonstrate the performance of the possible methods for EEG signal recovery from the compressed 
measurement, we perform two groups of numerical experiments. The details about the data materials and 
subjects are given in section ITV-A1 In section ITV-B1 we test the performance of two cosparse signal recovery 
methods for single-channel EEG signals in different kinds of situations, i.e. analysis LI optimization and 
GAP. Some other algorithms are tested to make comparison, such as BSBL which is reported to be the 
best of all the current candidates for EEG signal recovery from compressed measurement [|5]|, and OMP 
which is a proper representative of the classical sparse signal recovery algorithms [Q. In section IIV-C1 a 
group of multi-channel EEG signals are recovered by the proposed algorithms for SCLR optimization, as 
well as simultaneous orthogonal matching pursuit (SOMP) If28l . BSBL [|5l lfl6l . and simultaneous greedy 
analysis pursuit (SGAP) f[29H . 

In all experiments, as argued by our analysis in section HI the 2nd-order difference matrix is chosen 
to be the analysis dictionary for cosparse EEG signal recovery. The Gaussian matrix is chosen to be 
the sampling matrix for CS of EEG signals. The sparse dictionaries of OMP and SOMP are Daubechies 
wavelets m. 

To measure the compression degree, the subsampling ratio (SSR) is defined as: 

SSR = ^ x 100%. (23) 

To quantify the difference between high-dimensional values implied by the estimator and the true values 
of the quantity being estimated, two different evaluation functions are often used in EEG signal processing. 
One is the mean squared error (MSE) which measures the average of the squares of the errors. The error 
is the amount by which the value implied by the estimator differs from the quantity to be estimated. Here 
we can formulate it as: 


Y X '~ X F 

MSE = Y --iL, (24) 

Tt LNR 

where X is the true EEG data with R channels and each channel has length N, X; is its estimate in the Z-th 
experiment, and L is the number of experiments. Both X and X/ are normalized by their Frobenius norms 
respectively. When R = 1, the matrix X is degenerated into a vector x. In that case, MSE can be used to 
evaluate single-channel EEG signal reconstruction evaluation. The MSE has variants of other equivalent 
forms, such as mean L2 error If30l , percent of root-mean-square difference (PRD) [@]. 
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Another evaluation function is the mean cross-correlation (MCC). It is equivalent to the Structural 
SIMilarity index (SSIM), which measures the similarity of two waveforms PTl 01 Q. It can be formulated 
as: 

vec(X) r vec (X,) 


MCC 


L 

I 

1=1 


£||X|| f X, 


(25) 


A. Data material and subjects 

The used EEG data is the CHB-MIT scalp EEG database which is online available in the Phys- 
iobank database: http://www.physionet.org/cgi-bin/atm/ATM 021 Il33l . Collected at the Children’s Hos¬ 
pital Boston, these EEG recordings are from pediatric subjects with intractable seizures. Subjects were 
monitored without anti-seizure medication in order to characterize their seizures and assess their candidacy 
for surgical intervention. All the recordings were collected from 22 subjects (5 males, ages 3-22; and 17 
females, ages 1.5-19). All used datasets consist of 23-channel EEG recordings which were sampled at 
256 samples per second with 16-bit resolution. The international 10-20 system of EEG electrode positions 
and nomenclature was used for these recordings. More details about the EEG database can be found 021 . 
In our experiments, the EEG recording chb0l_3l.edf has been selected to demonstrate the recovery 
algorithms’ performance. 

In section ITV-Bl L = 500 segments of EEG data are used, i.e. X/ 6 ]R A ' xl , / = 1,2, • • ■, L. They are taken 
from all the R= 23 channels sequentially. The length of each segment of the EEG data x is N = 256. Each 
segment of EEG data is normalized by its G norm. 

In section HV-Cl L = 50 segments of 23-channel EEG data are used, i.e. X/ e R NxR , l = 1,2, ■ ■ ■ ,L. 
In each segment of the EEG data matrix X, the number of sampling points is N x R = 256 x 23. Each 
segment of EEG data is normalized by its Frobenius norm. 

B. Single channel EEG signal recovery 

To show how the proposed cosparse signal recovery methods work, we take a segment of single channel 
EEG signal and reconstruct it from the compressed measurement with SSR = 0.35. The reconstructed and 
real signals are shown in Fig. H] We can see that the reconstructed signals from GAP and the analysis LI 
optimization methods fit the real signal better than those from the classical OMP and BSBL methods. 






Fig. 1: EEG signals reconstructed by OMP, BSBL, GAP, and analysis LI optimization with SSR = 0.35. 

Fig. [2al and Fig. [2b] give the values of MSE and MCC of GAP, OMP, BSBL with different SSRs. 
We can see that analysis LI optimization, GAP and BSBL have similar accuracy performance, and they 
outperform OMP. Analysis LI optimization is slightly more accurate than GAP, and GAP is slightly more 
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accurate than BSBL. Fig. [2c] shows that the greedy algorithms GAP and OMP are much faster than BSBL 
and analysis LI optimization, and GAP is even slightly faster than OMP Therefore, if we only care about 
the accuracy, the analysis LI optimization is the best choice; and if both accuracy and computational 
complexity are important, GAP should be a better choice. 


C. Multi-channel EEG signal recovery 

In these experiments, most of the parameters are selected as in section IIV-BI Two algorithms for 
SCLR optimization are used, i.e. interior point method for SCLR optimization and ADMM for SCLR 
optimization with experienced choices of the parameters T max = 5, p = 1, ij = 0.05. In comparison with 
the proposed methods, 3 other popular multi-channel sparse / cosparse signal recovery methods are taken 
too. i.e. BSBL, SOMP and SGAP 

Fig. [Taj [3b] and Qc] display the values of MSE, MCC and CPU times of the interior point method for the 
SCLR optimization, ADMM for SCLR optimization, BSBL, SOMP, and SGAP with different values of 
SSR. We can see that the interior point method for SCLR optimization, ADMM for SCLR optimization 
have similar accuracy performance, and they outperform the other ones in accuracy. Comparing the speed 
of these two solutions for SCLR optimization, the ADMM for SCLR optimization is faster. In Fig. I3c1 we 
can see that the greedy algorithms SOMP and SGAP are much faster than the rest. But their accuracy 
is much worse and not acceptable. Therefore, we recommend that the ADMM for SCLR optimization 
should be a better candidate for multi-channel EEG signal recovery than the other methods. 

V. Conclusion 

With the 2nd-order difference matrix as the cosparse analysis dictionary, the EEG signals’ cosparsity is 
exploited for the single-channel EEG signal recovery from compressed measurements. To further enhance 
the performance, cosparsity and low rank structure are jointly used in the multi-channel EEG signal 
recovery. In the proposed new optimization model, the £ 0 norm constraint is used to encourage cosparsity 
while Schatten-0 norm constraint is used for low rank structure. To solve the optimization model, two 
methods are used. One approximates it by relaxing the £ {) and Schatten-0 norms into t\ norm and nuclear 
norm respectively, which leads to a convex optimization. The other way is ADMM which divides the 
multiple criterion optimization into several connected single criterion optimizations in the form of global 
consensus optimization. Each single criterion optimization can be solved by a series of existing efficient 
methods. In numerical experiments, EEG signals’ cosparsity for CS is proved by the single-channel EEG 
data based results; and the multi-channel EEG data results show that the SCLR optimization outperforms 
all the previous methods. 
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vs SSR; (b) MCC vs SSR; (c) CPU Time vs SSR. 
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